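In this paper, we propose an analytical method for performing tractable approximate Gaussian inference (TAGI) in Bayesian neural networks. The method enables analytical Gaussian inference of the posterior mean vector and diagonal covariance matrix for the weights and biases. The proposed method has a computational complexity of $\mathcal{O}(n)$ with respect to the number of parameters $n$, and tests on regression and classification benchmarks confirm that, for the same network architecture, it matches the performance of existing methods that rely on gradient backpropagation.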
translated by 谷歌翻译
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, and algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Semantic communication (SemCom) and edge computing are two disruptive solutions for addressing the emerging requirements of huge data communication, bandwidth efficiency, and low-latency data processing in the Metaverse. However, edge computing resources are often provided by computing service providers, and thus it is essential to design appealing incentive mechanisms for the provision of limited resources. Deep learning (DL)-based auction has recently been proposed as an incentive mechanism that maximizes the revenue while holding important economic properties, i.e., individual rationality and incentive compatibility. Therefore, in this work, we introduce the design of the DL-based auction for computing resource allocation in the SemCom-enabled Metaverse. First, we briefly introduce the fundamentals and challenges of the Metaverse. Second, we present the preliminaries of SemCom and edge computing. Third, we review various incentive mechanisms for edge computing resource trading. Fourth, we present the design of the DL-based auction for edge resource allocation in the SemCom-enabled Metaverse. Simulation results demonstrate that the DL-based auction improves the revenue while nearly satisfying the individual rationality and incentive compatibility constraints.
Classical differentially private DP-SGD implements individual clipping with random subsampling, which forces a mini-batch SGD approach. We provide a general differentially private algorithmic framework that goes beyond DP-SGD and allows any possible first-order optimizer (e.g., classical SGD and momentum-based SGD approaches) in combination with batch clipping, which clips an aggregate of computed gradients rather than summing clipped gradients (as is done in individual clipping). The framework also admits sampling techniques beyond random subsampling, such as shuffling. Our DP analysis follows the $f$-DP approach and introduces a new proof technique which allows us to also analyse group privacy. In particular, for $E$ epochs of work and groups of size $g$, we show a $\sqrt{g E}$ DP dependency for batch clipping with shuffling. This is much better than the previously anticipated linear dependency on $g$, and much better than the previously expected square-root dependency on the total number of rounds within $E$ epochs, which is generally much larger than $\sqrt{E}$.
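To make the distinction between the two clipping rules concrete, the following is a minimal sketch in plain Python. It assumes that the aggregate is a plain sum of per-example gradients and omits the Gaussian noise that a full DP mechanism would add; the function names and the toy gradients are illustrative assumptions, not taken from the paper.

```python
import math

def l2_norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def clip(v, c):
    """Rescale v so that its L2 norm is at most c."""
    norm = l2_norm(v)
    scale = min(1.0, c / norm) if norm > 0 else 1.0
    return [x * scale for x in v]

def vector_sum(vectors):
    """Coordinate-wise sum of equally sized vectors."""
    return [sum(col) for col in zip(*vectors)]

def individual_clipping(grads, c):
    """DP-SGD style: clip every per-example gradient, then sum."""
    return vector_sum([clip(g, c) for g in grads])

def batch_clipping(grads, c):
    """Batch clipping: aggregate first (assumed here: a sum), then clip once."""
    return clip(vector_sum(grads), c)

# Hypothetical per-example gradients, for illustration only.
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
ind = individual_clipping(grads, 1.0)  # norm can reach len(grads) * c
bat = batch_clipping(grads, 1.0)       # norm is at most c
```

With individual clipping the norm of the result can grow with the batch size, whereas batch clipping bounds the norm of the whole aggregate by the clipping constant.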
Image captioning is a challenging task that requires the ability both to understand the visual information in an image and to describe it in human language. In this paper, we propose an efficient way to improve the image understanding ability of transformer-based methods by extending the Object Relation Transformer architecture with the Attention on Attention mechanism. Experiments on the VieCap4H dataset show that our proposed method significantly outperforms the original architecture on both the public test and the private test of the Image Captioning shared task held by VLSP.
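We present a data collection and annotation pipeline that extracts information from Vietnamese radiology reports to provide accurate labels for chest X-ray (CXR) images. This makes it possible to annotate data that match the local diagnostic categories, which may vary from country to country. To assess the efficacy of the proposed labeling technique, we built a CXR dataset containing 9,752 studies and evaluated our pipeline on a subset of this dataset. With an F1-score of at least 0.9923, the evaluation demonstrates that our labeling tool performs precisely and consistently across all classes. After building the dataset, we train deep learning models that leverage knowledge transferred from large public CXR datasets. We employ a variety of loss functions to overcome the curse of imbalanced multi-label datasets and experiment with various model architectures to select the one that delivers the best performance. Our best model (CheXpert-pretrained EfficientNet-B2) yields an F1-score of 0.6989 (95% CI 0.6740, 0.7240), an AUC of 0.7912, a sensitivity of 0.7064, and a specificity of 0.8760 for general abnormality diagnosis. Finally, we demonstrate that our coarse classification (based on five specific locations of abnormalities) yields results comparable to the classification over twelve pathologies on the benchmark CheXpert dataset for general anomaly detection, while delivering better average performance across all classes.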
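Unmanned aerial vehicles (UAVs) have been employed in many fields, including photography, emergency response, entertainment, defence, agriculture, forestry, mining, and construction. Over the last decade, UAV technology has found applications in numerous construction project phases, ranging from site mapping, progress monitoring, and building inspection to damage assessment and material delivery. While extensive studies have been conducted on the advantages of UAVs for various construction-related processes, studies on UAV collaboration to improve task capacity and efficiency are still scarce. This paper proposes a new cooperative path planning algorithm for multiple UAVs based on the Stag Hunt game and particle swarm optimization (PSO). First, a cost function is defined for each UAV, incorporating multiple objectives and constraints. A UAV game framework is then developed to formulate multi-UAV path planning as the problem of finding a payoff-dominant equilibrium. Next, a PSO-based algorithm is proposed to obtain the optimal paths for the UAVs. Simulation results for a large construction site inspected by three UAVs indicate the effectiveness of the proposed algorithm in generating feasible and efficient flight paths for the UAV formation during the inspection task.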
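As a rough illustration of the optimization step only, the sketch below implements a generic particle swarm optimizer in plain Python. It is not the authors' algorithm: the quadratic `toy_cost` is a hypothetical stand-in for a UAV path cost, the game-theoretic equilibrium search is omitted, and the inertia and acceleration coefficients are commonly used default values chosen here as an assumption.

```python
import random

def pso(cost, dim, n_particles=20, n_iters=100, bounds=(-5.0, 5.0),
        w=0.7298, c1=1.49618, c2=1.49618, seed=0):
    """Minimize `cost` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]               # best position seen by each particle
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]  # best position seen by the swarm
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost

def toy_cost(p):
    """Hypothetical stand-in cost with its minimum at (1, 1, ...)."""
    return sum((x - 1.0) ** 2 for x in p)

best, best_cost = pso(toy_cost, dim=2)
```

In a path planning setting, each particle would instead encode a candidate path (for example, a sequence of waypoints), and the cost would combine objectives such as path length and constraint penalties.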
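In this paper, we introduce a high-quality, large-scale benchmark dataset for English-Vietnamese speech translation with 508 audio hours, consisting of 331K triplets of (sentence-length audio, English source transcript sentence, Vietnamese target subtitle sentence). We also conduct empirical experiments using strong baselines and find that the traditional "cascaded" approach still outperforms the modern "end-to-end" approach. To the best of our knowledge, this is the first large-scale English-Vietnamese speech translation study. We hope that both our publicly available dataset and our study can serve as a starting point for future research and applications in English-Vietnamese speech translation. Our dataset is available at https://github.com/vinairesearch/phost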
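Recent artificial intelligence (AI) algorithms have achieved radiologist-level performance on various medical classification tasks. However, only a few studies address the localization of abnormal findings in CXR scans, which is essential for explaining the image-level classification to radiologists. In this paper, we introduce an explainable deep learning system called VinDr-CXR that can classify a CXR scan into multiple thoracic diseases and, at the same time, localize most types of critical findings on the image. VinDr-CXR was trained on 51,485 CXR scans with radiologist-provided bounding box annotations. It demonstrates performance comparable to that of experienced radiologists in classifying 6 common thoracic diseases on a retrospective validation set of 3,000 CXR scans, with a mean area under the receiver operating characteristic curve (AUROC) of 0.967 (95% confidence interval [CI]: 0.958-0.975). VinDr-CXR was also externally validated on independent patient cohorts and showed its robustness. For the localization task with 14 types of lesions, our free-response receiver operating characteristic (FROC) analysis shows that VinDr-CXR achieves a sensitivity of 80.2% at a rate of 1.0 false-positive lesion identified per scan. A prospective study was also conducted to measure the clinical impact of VinDr-CXR in assisting six experienced radiologists. The results indicate that the proposed system, when used as a diagnosis support tool, significantly improved the agreement among the radiologists themselves, with an increase of 1.5% in mean Fleiss' kappa. We also observed that, after the radiologists consulted VinDr-CXR's suggestions, the agreement between each of them and the system increased markedly, by 3.3% in mean Cohen's kappa.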
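For readers unfamiliar with the agreement statistic mentioned above, the following minimal sketch computes Cohen's kappa for two raters in plain Python, using the standard definition $\kappa = (p_o - p_e) / (1 - p_e)$. The label lists are made-up illustrative data, not results from the study, and the function name is an assumption.

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa for two equally long lists of categorical labels."""
    n = len(a)
    # Observed agreement: fraction of items on which both raters agree.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Expected chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters always use the same single label
    return (p_o - p_e) / (1.0 - p_e)

# Hypothetical binary findings from a reader and a system (illustration only).
reader = [1, 1, 0, 0, 1, 0, 1, 0]
system = [1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(reader, system)  # p_o = 0.75, p_e = 0.5
```

Kappa corrects raw agreement for the agreement expected by chance, so a value of 0 means chance-level agreement and 1 means perfect agreement.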
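The rapid development of representation learning techniques and the availability of large-scale medical imaging data have led to a rapid increase in the use of machine learning in 3D medical image analysis. In particular, deep convolutional neural networks (D-CNNs) have been key players and have been adopted by the medical imaging community to assist clinicians and medical experts in disease diagnosis. However, training deep neural networks such as D-CNNs on high-resolution 3D volumes of computed tomography (CT) scans for diagnostic tasks poses formidable computational challenges. This raises the need for deep learning-based approaches that learn robust representations from 2D images instead of 3D scans. In this paper, we propose a new strategy for training \emph{slice-level} classifiers on CT scans based on the descriptors of adjacent slices along the axis, each of which is extracted by a convolutional neural network (CNN). The method is applicable to CT datasets with per-slice labels, such as the RSNA Intracranial Hemorrhage (ICH) dataset, whose task is to predict the presence of ICH and classify it into 5 different subtypes. We obtain a single model that ranks in the top 4\% of solutions in the RSNA ICH challenge, where model ensembles are allowed. Experiments also show that the proposed method significantly outperforms the baseline model on CQ500. The proposed method is general and can be applied to other 3D medical diagnosis tasks, such as MRI imaging. To encourage new advances in the field, we will make our code and pre-trained models available upon acceptance of the paper.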